IndexToolkit: an open source toolbox to index protein databases for high-throughput proteomics
Abstract
IndexToolkit is a software package that addresses the disadvantage of FASTA-format databases for frequent searching by using an indexing strategy to substantially accelerate sequence queries. It includes user-friendly tools and an Application Programming Interface (API) for indexing, storing and retrieving protein sequence databases. As open source software, it provides a sequence-retrieval development framework that is easily extensible for proteomic applications that issue requests at high speed, such as database searching or modification discovery. We applied IndexToolkit to the database search engine pFind to demonstrate its effect. Experimental studies show that IndexToolkit supports significantly faster searches of protein databases.
Availability: IndexToolkit is free to use under the open source GNU GPL license. The source code and compiled binaries can be freely accessed through the website http://pfind.jdl.ac.cn/IndexToolkit. The website also provides more detailed information, including screenshots and documentation for users and developers.
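To illustrate why indexing helps with repeated queries, the following is a minimal sketch of a byte-offset index over FASTA data. It is not IndexToolkit's API or index format, neither of which is described in the abstract; the function names `build_offset_index` and `fetch_sequence` and the sample records are hypothetical. The idea shown is generic: scan the file once, remember where each record starts, and then jump directly to a record instead of rescanning the whole file on every query.

```python
import io


def build_offset_index(handle):
    """Scan a binary FASTA stream once and map record ID -> header byte offset.

    Hypothetical helper for illustration; not part of IndexToolkit.
    """
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            tokens = line[1:].split()
            if tokens:
                # The record ID is the first whitespace-delimited token.
                index[tokens[0].decode("ascii")] = offset
    return index


def fetch_sequence(handle, index, record_id):
    """Seek straight to a record and return its sequence as one string."""
    handle.seek(index[record_id])
    handle.readline()  # skip the header line
    parts = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break  # reached the next record
        parts.append(line.strip())
    return b"".join(parts).decode("ascii")


# Made-up sample data; a real database would be opened with open(path, "rb").
FASTA = b">P1 example protein\nMKTAYIAK\nQRQISFVK\n>P2 another\nGAVLIPFW\n"
handle = io.BytesIO(FASTA)
index = build_offset_index(handle)
sequence = fetch_sequence(handle, index, "P2")
```

The index is built once at a cost proportional to the file size; each later lookup costs one seek plus the length of the requested record, which is where the benefit for frequent searching comes from.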
Similar resources
MSProGene: integrative proteogenomics beyond six-frames and single nucleotide polymorphisms
Ongoing advances in high-throughput technologies have facilitated accurate proteomic measurements and provide a wealth of information on genomic and transcript level. In proteogenomics, this multi-omics data is combined to analyze unannotated organisms and to allow more accurate sample-specific predictions. Existing analysis methods still mainly depend on six-frame translations or re...
Learning from Heterogeneous Data Sources: An Application in Spatial Proteomics
Sub-cellular localisation of proteins is an essential post-translational regulatory mechanism that can be assayed using high-throughput mass spectrometry (MS). These MS-based spatial proteomics experiments enable us to pinpoint the sub-cellular distribution of thousands of proteins in a specific system under controlled conditions. Recent advances in high-throughput MS methods have yielded a ple...
StatQuant: a post-quantification analysis toolbox for improving quantitative mass spectrometry
Motivation: Mass spectrometric protein quantitation has emerged as a high-throughput tool to yield large amounts of data on peptide and protein abundances. Currently, differential abundance data can be calculated from peptide intensity ratios by several automated quantitation software packages available. There is, however, still a great need for additional processing to validate and refine the q...
Efficient visualization of high-throughput targeted proteomics experiments: TAPIR
Motivation: Targeted mass spectrometry comprises a set of powerful methods to obtain accurate and consistent protein quantification in complex samples. To fully exploit these techniques, a cross-platform and open-source software stack based on standardized data exchange formats is required. Results: We present TAPIR, a fast and efficient Python visualization software for chromatograms and peaks...
Obsessive-Compulsive Disorder Interactome Profile Analysis: A Perspective From Molecular Mechanism
Introduction: Obsessive-Compulsive Disorder (OCD) is one of the complex neuropsychiatric conditions. This disorder disables individuals in many different aspects of their personal and social life. Interactome analysis may provide a better understanding of this disorder’s molecular origin and its underlying mechanisms. Methods: In this study, the OCD-associated genes were extracted from the lit...
Journal title:
- Bioinformatics
Volume 22, Issue 20
Pages -
Publication date 2006